Provenance as Dependency Analysis
Abstract
Provenance is information recording the source, derivation, or history of some information. Provenance tracking has been studied in a variety of settings, particularly database management systems; however, although many candidate definitions of provenance have been proposed, the mathematical or semantic foundations of data provenance have received comparatively little attention. In this article, we argue that dependency analysis techniques familiar from program analysis and program slicing provide a formal foundation for forms of provenance that are intended to show how (part of) the output of a query depends on (parts of) its input. We introduce a semantic characterization of such dependency provenance for a core database query language, show that minimal dependency provenance is not computable, and provide dynamic and static approximation techniques. We also discuss preliminary implementation experience with using dependency provenance to compute data slices, or summaries of the parts of the input relevant to a given part of the output.
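To make the idea of dynamic dependency tracking concrete, here is a minimal sketch in Python. It is not the article's formal semantics or its implementation: the names `Ann`, `Coll`, `annotate`, `select`, `total`, and `data_slice`, and the sample `emp` table, are hypothetical and chosen only for illustration. Each input cell carries a label, query operators propagate sets of labels alongside values, and the labels attached to an output are then used to mask the input cells that the output does not depend on, giving a rough data slice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ann:
    """A value paired with the labels of input cells it may depend on."""
    value: object
    deps: frozenset = frozenset()


@dataclass
class Coll:
    """Annotated rows plus labels that influence which rows are present."""
    rows: list
    deps: frozenset = frozenset()


def annotate(table, name):
    """Label every cell of an input table as (table name, row index, column)."""
    rows = [
        {col: Ann(val, frozenset({(name, i, col)})) for col, val in row.items()}
        for i, row in enumerate(table)
    ]
    return Coll(rows)


def select(coll, col, pred):
    """Keep rows whose `col` satisfies `pred`.

    Membership of the result depends on every tested cell, including cells
    of rejected rows: changing one of them could let that row through.
    """
    kept = [row for row in coll.rows if pred(row[col].value)]
    tested = frozenset().union(*(row[col].deps for row in coll.rows))
    return Coll(kept, coll.deps | tested)


def total(coll, col):
    """Sum a column; the sum depends on the summed cells and on membership."""
    cells = frozenset().union(*(row[col].deps for row in coll.rows))
    return Ann(sum(row[col].value for row in coll.rows), coll.deps | cells)


def data_slice(table, name, deps):
    """Mask (with None) every input cell that is not in `deps`."""
    return [
        {col: (val if (name, i, col) in deps else None) for col, val in row.items()}
        for i, row in enumerate(table)
    ]


# Hypothetical input: total pay of department "x".
emp = [
    {"name": "a", "dept": "x", "pay": 10},
    {"name": "b", "dept": "y", "pay": 20},
    {"name": "c", "dept": "x", "pay": 30},
]
result = total(select(annotate(emp, "emp"), "dept", lambda d: d == "x"), "pay")
print(result.value)
print(data_slice(emp, "emp", result.deps))
```

In this sketch the computed label set is a safe over-approximation rather than a minimal one, which is in keeping with the abstract's point that minimal dependency provenance is not computable. Every `name` cell and the `pay` cell of the rejected row are masked in the slice, while the rejected row's `dept` cell is kept because editing it could change the total.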
Similar resources
A Query Language of Data Provenance Based on Dependency View for Process Analysis
As the scale of data in processes keeps increasing, data provenance also grows large, which challenges the efficiency of provenance tracking in process analysis. This paper proposes a dependency view to extract a global data provenance description of a data process instance, and then defines a contextual query language based on the dependency view to imple...
Derivation Rule Dependency and Data Provenance Semantics
This paper proposes a derivation rule dependency (DRD) to represent data provenance semantics. Data provenance is mostly used for tracing data lineage and data creation processes. We propose to treat data provenance semantics as derivation-dependency metadata. This study is conceptual. We are in the process of building a prototype that manages it and applying it to real-world examples. ...
Declarative Rules for Inferring Fine-Grained Data Provenance from Scientific Workflow Execution Traces
Fine-grained dependencies within scientific workflow provenance specify lineage relationships between a workflow result and the input data, intermediate data, and computation steps used in the result’s derivation. This information is often needed to determine the quality and validity of scientific data, and as such, plays a key role in both provenance standardization efforts and provenance quer...
A Fine-Grained Workflow Model with Provenance-Aware Security Views
In this paper we propose a fine-grained workflow model, based on context-free graph grammars, in which the dependency relation between the inputs and outputs of a module is explicitly specified as a bipartite graph. Using this model, we develop an access control mechanism that supports provenance-aware security views. Our security model not only protects sensitive data and modules from unauthor...
Reconstructing Provenance Preliminary Results - Technical Report
Therefore, we developed a complementary approach, which considers the simpler problem of reconstructing provenance intended as dependencies between entities. The rationale is that once we are able to distinguish dependent entities, it becomes possible to refine the dependency relationships into sequences of operations. In the following, we describe a prototype implementation of this approach an...